The Spoken BNC2014


Abstract

This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.


Similar resources

The Role of Thematic Structure in Comprehending Spoken Language

In fact, this study is concerned with the relationship between the variation in thematic structure and the comprehension of spoken language. So the study focused on the following questions: 1. Is there any relationship between thematic structure and the comprehension of spoken language? 2. Which of the themes would have greater thematic force and be easier for the subjects to comprehend? accord...


On the Use of Diary Study to Investigate Avoidance Strategy in Spoken English Courses

In the present study, an attempt is made to investigate the frequency and motives of using avoidance strategies by a group of Iranian intermediate language learners through their own journal writing. The effect of gender on the use of avoidance strategies is to be investigated as well. Thirty-nine female and twenty-three male learners enrolled in a spoken English course in a private E...


Introduction: Compiling and analysing the Spoken British National Corpus 2014

For over twenty years, the British National Corpus has been one of the most widely known and used corpora. It is almost impossible to attend an international corpus linguistics conference such as Corpus Linguistics, ICAME (International Computer Archive of Modern and Medieval English), AACL (American Association for Corpus Linguistics) or APCLC (Asia Pacific Corpus Linguistics Conference) witho...


Navigating the Spoken Wikipedia

The Spoken Wikipedia project unites volunteer readers of encyclopedic entries. Their recordings make encyclopedic knowledge accessible to persons who are unable to read (because of alexia, visual impairment, or because their sight is currently occupied, e.g. while driving). However, on Wikipedia, recordings are available as raw audio files that can only be consumed linearly, without the possibilit...


Journal

Journal title: International Journal of Corpus Linguistics

Year: 2017

ISSN: 1569-9811, 1384-6655

DOI: https://doi.org/10.1075/ijcl.22.3.02lov